fix(connectors): harden 10 KB connectors after audit#4410

Merged
waleedlatif1 merged 20 commits into staging from waleedlatif1/connector-audit-fixes
May 3, 2026

Conversation

@waleedlatif1
Collaborator

@waleedlatif1 waleedlatif1 commented May 2, 2026

Summary

  • jira: migrate from deprecated /rest/api/3/search (Atlassian sunset May 2025) to /rest/api/3/search/jql with nextPageToken pagination
  • confluence: unify stub hash across v1 CQL (when) and v2 (createdAt) paths via shared pageToStub helper using version.number
  • salesforce: replace hardcoded login.salesforce.com userinfo with host fallback so sandbox-issued tokens (test.salesforce.com) work
  • servicenow: validate sys_id against /^[a-f0-9]{32}$/ and switch getDocument to path-based /api/now/table/{table}/{sys_id} to close encoded-query injection; reject ^ in kbCategory
  • zendesk: URL-encode Search API query via URLSearchParams; whitelist ticket statuses; encode locale path segment
  • github: add 10MB cap and /git/blobs/{sha} fallback for files >1MB that /contents/ returns with encoding:"none"
  • slack: replace SHA-256 over formatted-message window with metadata hash slack:{channelId}:{oldestTs}:{latestTs}:{count} so list and getDocument agree; cache auth.test team_id on syncContext
  • obsidian: drop syncRunId from stub hash (Local REST API has no HEAD/Last-Modified per OpenAPI spec); fall back to path-only stub
  • evernote: title fallback for attachments-only notes — breaks infinite hydration loop where empty plaintext returned null
  • google-docs: drop residual error instanceof Error pattern

Each fix was validated against the provider API docs by an independent subagent before applying. Confluence and Slack hash format changes self-heal with a one-time re-sync; no data loss.
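The ServiceNow item in the summary can be illustrated with a minimal sketch. `buildRecordUrl`, the instance URL, and the error text are hypothetical names chosen for illustration, not the connector's actual code. The pattern is the one quoted above, with the case-insensitive flag that a later commit in this PR adds.

```typescript
// Hypothetical helper: the function name and error message are assumptions.
const SYS_ID_PATTERN = /^[a-f0-9]{32}$/i;

function buildRecordUrl(instanceUrl: string, table: string, sysId: string): string {
  // Reject anything that is not exactly 32 hex characters before it reaches a URL.
  if (!SYS_ID_PATTERN.test(sysId)) {
    throw new Error(`Invalid sys_id: ${JSON.stringify(sysId)}`);
  }
  // Path-based lookup: the sys_id is never spliced into an encoded query,
  // so a value such as "abc^ORactive=true" cannot change what is selected.
  const base = instanceUrl.replace(/\/+$/, "");
  return `${base}/api/now/table/${encodeURIComponent(table)}/${sysId}`;
}
```

Validating first and then addressing the record by path means the `^` operator of ServiceNow encoded queries never has a chance to be interpreted.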

Type of Change

  • Bug fix

Testing

Tested manually. Lint and typecheck pass.

Checklist

  • Code follows project style guidelines
  • Self-reviewed my changes
  • Tests added/updated and passing
  • No new warnings introduced
  • I confirm that I have read and agree to the terms outlined in the Contributor License Agreement (CLA)

waleedlatif1 and others added 2 commits May 2, 2026 13:05
Validated each issue against provider docs before fixing.

- jira: migrate from deprecated /rest/api/3/search (Atlassian sunset
  May 2025) to /rest/api/3/search/jql with nextPageToken pagination
- confluence: unify stub hash across v1 CQL (`when`) and v2
  (`createdAt`) paths via shared pageToStub helper using version.number
- salesforce: replace hardcoded login.salesforce.com userinfo with
  host fallback so sandbox-issued tokens (test.salesforce.com) work
- servicenow: validate sys_id against /^[a-f0-9]{32}$/ and switch
  getDocument to path-based /api/now/table/{table}/{sys_id} to close
  encoded-query injection; reject `^` in kbCategory filter
- zendesk: URL-encode Search API query via URLSearchParams; whitelist
  ticket statuses; encode locale path segment
- github: add 10MB cap and /git/blobs/{sha} fallback for files >1MB
  that /contents/ returns with encoding:"none"
- slack: replace SHA-256 over formatted-message window with metadata
  hash slack:{channelId}:{latestTs}:{count} so list and getDocument
  agree; cache auth.test team_id on syncContext
- obsidian: drop syncRunId from stub hash (Local REST API has no
  HEAD/Last-Modified per OpenAPI spec); fall back to path-only stub
  so engine two-stage check short-circuits unchanged notes
- evernote: title fallback for attachments-only notes — breaks
  infinite hydration loop where empty plaintext returned null
- google-docs: drop residual `error instanceof Error` pattern

Confluence and Slack hash format changes self-heal with a one-time
re-sync; no data loss.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
… edits

Audit caught that the metadata hash slack:{channel}:{latestTs}:{count}
misses edits to the oldest message in the rolling window — count and
latestTs both stay constant. Adding oldestTs catches window-shift
when a new message arrives and pushes the oldest out without changing
the newest.

Also drop residual error instanceof Error pattern in validateConfig.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
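A minimal sketch of the metadata hash described in the commit above, using the `slack:{channelId}:{oldestTs}:{latestTs}:{count}` format from the PR summary. The `SlackMessage` shape, the function name, and the empty-window behaviour are assumptions; the connector's real hash reportedly carries additional edit and thread signals that are not modelled here.

```typescript
// Assumed minimal message shape; real Slack messages carry many more fields.
interface SlackMessage {
  ts: string; // Slack timestamp string, e.g. "1714650000.000100"
}

// Hypothetical helper name. Empty windows yield empty timestamp fields (an assumption).
function channelContentHash(channelId: string, messages: SlackMessage[]): string {
  // Slack timestamps have a fixed-width format, so string comparison orders them
  // without the precision loss of converting to a floating-point number.
  const stamps = messages.map((m) => m.ts).sort((a, b) => (a < b ? -1 : a > b ? 1 : 0));
  const oldest = stamps[0] ?? "";
  const latest = stamps[stamps.length - 1] ?? "";
  return `slack:${channelId}:${oldest}:${latest}:${stamps.length}`;
}
```

Because the fingerprint depends only on window metadata, the list and getDocument paths can compute it identically without formatting message text.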
@vercel
vercel Bot commented May 2, 2026

1 Skipped Deployment
Project Deployment Actions Updated (UTC)
docs Skipped Skipped May 3, 2026 0:20am

@cursor

cursor Bot commented May 2, 2026

PR Summary

Medium Risk
Touches multiple production connectors and changes pagination/content hashing behavior, which could trigger one-time re-syncs or missed/extra documents if provider APIs behave unexpectedly. Also adds new validation/escaping paths that may inadvertently filter out previously-synced items if edge-case inputs relied on looser handling.

Overview
Improves connector robustness across Confluence, Jira, GitHub, Slack, Salesforce, ServiceNow, Zendesk, Evernote, Google Docs, and Obsidian.

Key changes: Confluence now generates a canonical stub via pageToStub so contentHash/metadata are consistent across v1 CQL and v2 listing/get flows, and config validation uses shared retry options. Jira migrates listing/validation to /rest/api/3/search/jql with nextPageToken pagination and a cursor format that preserves collected counts for maxIssues, plus better ADF text extraction (hardBreak, mention, emoji).

Hardening + correctness: GitHub skips >10MB files, detects binary blobs, and falls back to the Blobs API when /contents returns encoding:"none"; branch URLs are encoded safely for slashes. Slack switches channel contentHash from hashing formatted text to a deterministic metadata-based fingerprint (including edits/thread signals), caches team ID lookups, and filters only known “noise” message subtypes while excluding DM IDs. Salesforce adds sandbox-compatible userinfo host fallback with caching and encodes record IDs in fetch URLs; ServiceNow blocks encoded-query/sys_id injection by validating sys_id, sanitizing filters, and using the path-based record fetch; Zendesk URL-encodes locale/query params, whitelists ticket statuses (with Search API cap warning), and exposes ticket priority as a tag. Evernote now falls back to note title when extracted plaintext is empty; Obsidian stubs use stable path-only hashes; several connectors standardize error handling via toError(...).

Reviewed by Cursor Bugbot for commit 75f0b1a. Configure here.
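The GitHub size handling summarized above can be sketched as a pure decision function. The names `planFetch` and `looksBinary`, the response shape, and the choice of 10 × 1024 × 1024 bytes for the "10MB" cap are assumptions made for illustration.

```typescript
// Assumed subset of a /contents response; only the fields used below.
interface ContentsEntry {
  size: number;
  encoding: string; // "base64" normally, "none" when GitHub omits the content
  sha: string;
}

type FetchPlan = "skip" | "empty" | "use-contents" | "use-blob-api";

// Assumption: the PR's 10MB cap is interpreted here as binary megabytes.
const MAX_FILE_BYTES = 10 * 1024 * 1024;

function planFetch(entry: ContentsEntry): FetchPlan {
  if (entry.size > MAX_FILE_BYTES) return "skip"; // over the cap: do not index
  if (entry.size === 0) return "empty"; // nothing to fetch for 0-byte files
  if (entry.encoding === "none") return "use-blob-api"; // fall back to /git/blobs/{sha}
  return "use-contents";
}

// NUL-byte scan: a common heuristic for detecting binary content.
function looksBinary(bytes: Uint8Array): boolean {
  return bytes.indexOf(0) !== -1;
}
```

Keeping the decision separate from the network calls makes each branch easy to exercise without hitting the API.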

@greptile-apps
Contributor

greptile-apps Bot commented May 2, 2026

Greptile Summary

This PR hardens 10 connector implementations after an audit, addressing a Jira API deprecation (sunset May 2025), injection risks in ServiceNow queries, sandbox token failures in Salesforce, hash instability in Confluence/Slack, large-file handling in GitHub, and several smaller content/validation fixes across Evernote, Zendesk, Obsidian, and Google Docs. All changes are well-documented and include defensive logging; Confluence and Slack content hashes will change on first deploy, triggering a one-time re-sync as noted in the PR description.

Confidence Score: 5/5

Safe to merge — no P0 or P1 bugs found; all fixes are well-reasoned and include defensive logging.

All 11 files were reviewed in depth. Jira cursor logic correctly uses lastIndexOf to handle tokens containing the separator character. Obsidian auth validation correctly reads the authenticated body field inside an outer try-catch. ServiceNow injection prevention is sound. Slack hash change and Confluence cursor restart are intentional, documented, and self-healing. No regressions identified.

No files require special attention. The Confluence and Slack contentHash format changes will trigger a one-time full re-sync on first deploy, which is expected and noted in the PR description.
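The Jira cursor handling mentioned in the review (a collected count embedded in the cursor, split with `lastIndexOf`) might look roughly like the sketch below. The separator character, the field order, and the function names are assumptions; the actual cursor format is not shown in this conversation.

```typescript
// Assumption: ":" stands in for whatever separator the connector really uses.
const CURSOR_SEPARATOR = ":";

function encodeCursor(nextPageToken: string, collected: number): string {
  return `${nextPageToken}${CURSOR_SEPARATOR}${collected}`;
}

function decodeCursor(cursor: string): { nextPageToken: string; collected: number } {
  // Split on the LAST separator so a token that itself contains the
  // separator character is preserved intact.
  const index = cursor.lastIndexOf(CURSOR_SEPARATOR);
  const countPart = index === -1 ? "" : cursor.slice(index + 1);
  if (!/^\d+$/.test(countPart)) {
    throw new Error(`Malformed cursor: ${JSON.stringify(cursor)}`);
  }
  return { nextPageToken: cursor.slice(0, index), collected: Number(countPart) };
}
```

Carrying the running count inside the cursor lets a cap such as maxIssues be enforced across pages without any shared state between calls.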

Important Files Changed

Filename Overview
apps/sim/connectors/jira/jira.ts Migrates search from deprecated /rest/api/3/search to /rest/api/3/search/jql with nextPageToken cursor; collected-count embedded in cursor string ensures cap works without syncContext; lastIndexOf('
apps/sim/connectors/confluence/confluence.ts Unifies v1 CQL and v2 content hash via shared pageToStub; old bare-string cursors now log a warning and restart cleanly rather than silently misaligning blogpost pagination; validateConfig retries correctly passed to getConfluenceCloudId.
apps/sim/connectors/salesforce/salesforce.ts Adds sandbox support via fetchUserinfo host-fallback loop; caches working host in syncContext; IsLatestVersion=true filter added to KnowledgeArticleVersion SOQL; encodeURIComponent applied to externalId in path URL.
apps/sim/connectors/servicenow/servicenow.ts Validates sys_id against 32-char hex pattern before use; switches getDocument to path-based API endpoint to eliminate encoded-query injection; input validation added for kbCategory, workflowState, incidentState, incidentPriority.
apps/sim/connectors/slack/slack.ts Replaces SHA-256-of-formatted-content hash with stable metadata hash that includes edit/reply/thread signals; caches auth.test team_id on syncContext; DM channels correctly excluded from direct-ID matching; noise-subtype set is more precise than the old allowlist.
apps/sim/connectors/github/github.ts Adds 10 MB cap during listing and getDocument; blob-API fallback for 1-10 MB files with binary detection via NUL-byte scan; branch path segments now correctly percent-encoded with / preserved; 403 from Contents API returns null instead of throwing.
apps/sim/connectors/zendesk/zendesk.ts Search API query built via URLSearchParams; status filter whitelisted against VALID_TICKET_STATUSES; Search API 1000-result cap warned on; locale path segment encoded; priority tag definition and mapping added.
apps/sim/connectors/obsidian/obsidian.ts Stub hash simplified to obsidian:{filePath} (dropping volatile syncRunId); auth validation now reads authenticated field from response body rather than relying on status codes that never fire per the API spec.
apps/sim/connectors/evernote/evernote.ts Attachment-only notes (empty plaintext) now fall back to title as content instead of returning null, breaking the infinite hydration loop; toError pattern applied to validateConfig.
apps/sim/connectors/google-docs/google-docs.ts Text extraction now strips trailing newlines per paragraph and joins parts with newline, producing equivalent output to the previous approach; error instanceof Error pattern replaced with toError.
apps/sim/tools/jira/utils.ts Adds optional RetryOptions to getJiraCloudId so validate path can pass VALIDATE_RETRY_OPTIONS; ADF text extraction extended to handle hardBreak, mention, and emoji node types.

Reviews (13): Last reviewed commit: "fix(slack): align validateConfig channel..."

Comment thread apps/sim/connectors/github/github.ts
Comment thread apps/sim/connectors/salesforce/salesforce.ts Outdated
Comment thread apps/sim/connectors/zendesk/zendesk.ts Outdated
- jira: thread VALIDATE_RETRY_OPTIONS through getJiraCloudId so the
  validate path uses the tighter retry budget
- confluence: thread VALIDATE_RETRY_OPTIONS through getConfluenceCloudId;
  drop residual error instanceof Error pattern
- zendesk: log a warning when statusFilter is set and limit exceeds
  the Search API 1000-result cap; add priority to tagDefinitions and
  mapTags so the metadata isn't orphaned

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- github: skip blob fetch for empty 0-byte files
- salesforce: add fallback message for empty error
- zendesk: warn on invalid statusFilter instead of silent fallthrough
- evernote: remove dead content.trim() guard
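The Zendesk changes in the commits above (encoding the Search API query with URLSearchParams, whitelisting statuses, warning on invalid filters) can be sketched as follows. `buildTicketSearchUrl`, the `type:ticket` base query, and the `/api/v2/search.json` path are assumptions about how the connector calls the API; `VALID_TICKET_STATUSES` is named in the review, but its contents here are taken from Zendesk's standard ticket statuses.

```typescript
// Standard Zendesk ticket statuses; the connector's actual set is not shown here.
const VALID_TICKET_STATUSES = new Set(["new", "open", "pending", "hold", "solved", "closed"]);

// Hypothetical helper for illustration only.
function buildTicketSearchUrl(subdomain: string, statusFilter?: string): string {
  let query = "type:ticket";
  if (statusFilter) {
    if (VALID_TICKET_STATUSES.has(statusFilter)) {
      query += ` status:${statusFilter}`;
    } else {
      // Warn rather than silently ignoring or forwarding an unknown status.
      console.warn(`Ignoring invalid ticket status filter: ${JSON.stringify(statusFilter)}`);
    }
  }
  // URLSearchParams percent-encodes the query, so spaces, colons, and any
  // user-supplied characters cannot break out of the query parameter.
  const params = new URLSearchParams({ query });
  return `https://${subdomain}.zendesk.com/api/v2/search.json?${params.toString()}`;
}
```

Only whitelisted values ever reach the query string, and everything that does is encoded by the platform rather than by hand.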
@waleedlatif1
Collaborator Author

Also addressed greptile's Evernote content.trim() dead-guard finding in f8f033e — removed the unreachable check since content is already guaranteed non-empty by the title fallback (which itself defaults to 'Untitled').
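A minimal sketch of the fallback chain described in this comment, assuming a hypothetical `resolveNoteContent` helper. It shows why a later emptiness check on the result is unreachable: the function can never return an empty string.

```typescript
// Hypothetical helper; the connector's real function and parameter names may differ.
function resolveNoteContent(plaintext: string, title?: string): string {
  const body = plaintext.trim();
  if (body.length > 0) return body;
  // Attachments-only notes have no extractable text, so use the title instead
  // of returning null (which previously caused the note to be re-hydrated forever).
  const fallbackTitle = (title ?? "").trim();
  return fallbackTitle.length > 0 ? fallbackTitle : "Untitled";
}
```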

@waleedlatif1
Collaborator Author

@greptile

@waleedlatif1
Collaborator Author

@cursor review

Comment thread apps/sim/connectors/zendesk/zendesk.ts Outdated
Comment thread apps/sim/connectors/zendesk/zendesk.ts Outdated
@waleedlatif1
Collaborator Author

@greptile

@waleedlatif1
Collaborator Author

@cursor review

Comment thread apps/sim/connectors/obsidian/obsidian.ts
- servicenow: case-insensitive sys_id pattern; validate workflowState/incidentState/incidentPriority as numeric, kbCategory via allowlist
- salesforce: include 400 in userinfo host fallthrough for sandbox tokens
- zendesk: add missing 'new' and 'hold' options to ticketStatus dropdown
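The Salesforce userinfo fallback mentioned in the commits above might be structured like the sketch below. The review names a `fetchUserinfo` loop, but this signature, the injected `HttpGet` type, and the inclusion of 401 and 403 alongside 400 as fall-through statuses are assumptions. The HTTP call is injected so the logic can be exercised without a network.

```typescript
// Assumed transport abstraction so the host-fallback logic is testable in isolation.
type HttpGet = (url: string, accessToken: string) => Promise<{ status: number; body: unknown }>;

const USERINFO_HOSTS = ["login.salesforce.com", "test.salesforce.com"];
// 400 comes from the commit message; 401 and 403 are assumptions.
const FALLTHROUGH_STATUSES = new Set([400, 401, 403]);

async function fetchUserinfo(
  get: HttpGet,
  accessToken: string,
  cachedHost?: string
): Promise<{ host: string; body: unknown }> {
  // Try a previously working host first, then the remaining candidates.
  const hosts = cachedHost
    ? [cachedHost, ...USERINFO_HOSTS.filter((h) => h !== cachedHost)]
    : USERINFO_HOSTS;
  let lastStatus = 0;
  for (const host of hosts) {
    const res = await get(`https://${host}/services/oauth2/userinfo`, accessToken);
    if (res.status === 200) return { host, body: res.body };
    lastStatus = res.status;
    // Only auth-style rejections suggest "wrong host"; anything else is a real failure.
    if (!FALLTHROUGH_STATUSES.has(res.status)) break;
  }
  throw new Error(`Salesforce userinfo request failed (last status ${lastStatus})`);
}
```

Returning the host that worked lets the caller cache it (the PR stores it on syncContext), so sandbox tokens pay for the extra round trip only once.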
@waleedlatif1
Collaborator Author

@greptile

@waleedlatif1
Collaborator Author

@cursor review

Comment thread apps/sim/connectors/servicenow/servicenow.ts
@waleedlatif1
Collaborator Author

@greptile

Switch from a \w-based allowlist (ASCII only) to a denylist that
rejects only encoded-query-meaningful characters (^, control chars,
quotes). International category names like 'Général' or 'Ação' are
now accepted.
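The difference between the two approaches can be seen in a short sketch. The exact character classes and the function name are assumptions based on the commit message above (caret, control characters, quotes), not the connector's actual pattern.

```typescript
// Assumed denylist: caret (the encoded-query operator), single and double
// quotes, and ASCII control characters.
const UNSAFE_CATEGORY_CHARS = /[\^'"\u0000-\u001f\u007f]/;

// Hypothetical helper name.
function isSafeCategory(value: string): boolean {
  return value.length > 0 && !UNSAFE_CATEGORY_CHARS.test(value);
}

// For comparison: an ASCII-only allowlist built on \w rejects accented names.
const ASCII_ALLOWLIST = /^[\w\s-]+$/;
```

With the denylist, `isSafeCategory("Général")` is true while `ASCII_ALLOWLIST.test("Général")` is false, which is the regression the commit fixes.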
@waleedlatif1
Collaborator Author

@greptile

@waleedlatif1
Collaborator Author

@cursor review

Comment thread apps/sim/connectors/servicenow/servicenow.ts Outdated
Wiki-template KB articles populate the wiki field and leave text empty;
HTML-template articles do the opposite. Removing wiki caused wiki-format
articles to resolve to empty content and be silently skipped.
@waleedlatif1
Collaborator Author

@greptile

@waleedlatif1
Collaborator Author

@cursor review

Comment thread apps/sim/connectors/jira/jira.ts Outdated
Comment thread apps/sim/connectors/google-docs/google-docs.ts
- jira: rely solely on absence of nextPageToken for end-of-results;
  data.isLast on /rest/api/3/search/jql is unreliable (JRACLOUD-95477)
  and the OR-with-isLast logic could truncate pagination early
- google-docs: strip trailing newline from each paragraph before
  joining so heading->body produces a single newline, not two
data.isLast on /rest/api/3/search/jql is unreliable (JRACLOUD-95477)
and the previous OR-with-isLast logic could truncate pagination early
if Jira ever returned isLast=true alongside a valid nextPageToken.
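The pagination rule from the commits above can be sketched as a loop that stops only when `nextPageToken` is absent. The `SearchPage` shape, the injected `FetchPage` function, and the `collectIssues` name are assumptions for illustration; issue payloads are reduced to strings.

```typescript
// Assumed minimal response shape for /rest/api/3/search/jql.
interface SearchPage {
  issues: string[];
  nextPageToken?: string;
  isLast?: boolean; // present in responses but deliberately not consulted
}

// Injected page fetcher so the loop can be tested without calling Jira.
type FetchPage = (nextPageToken?: string) => Promise<SearchPage>;

async function collectIssues(fetchPage: FetchPage, maxIssues: number): Promise<string[]> {
  const collected: string[] = [];
  let token: string | undefined;
  do {
    const page = await fetchPage(token);
    collected.push(...page.issues);
    // End-of-results is signalled solely by a missing nextPageToken.
    token = page.nextPageToken;
  } while (token && collected.length < maxIssues);
  return collected.slice(0, maxIssues);
}
```

If a page arrives with `isLast: true` and a valid token, this loop keeps going, which is the behaviour the fix is after.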
@waleedlatif1
Collaborator Author

@greptile

@waleedlatif1
Collaborator Author

@cursor review

@cursor cursor Bot left a comment

✅ Bugbot reviewed your changes and found no new issues!

Reviewed by Cursor Bugbot for commit 71f6256. Configure here.

DM (D...) channel IDs require im:*/mpim:* OAuth scopes, which the
connector does not request. The regex now matches only public (C) and
private (G) channel IDs to fail fast with a clearer error instead of
hitting missing_scope from Slack.
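A minimal sketch of the fail-fast check described above. The exact regex the connector uses is not shown in this conversation, so the patterns, the minimum ID length, and the `classifyChannelInput` name are all assumptions.

```typescript
// Assumed patterns: Slack IDs are an uppercase prefix letter followed by
// uppercase alphanumerics. The minimum length of 8 is a guess.
const CHANNEL_ID_PATTERN = /^[CG][A-Z0-9]{8,}$/;
const DM_ID_PATTERN = /^D[A-Z0-9]{8,}$/;

type ChannelInputKind = "channel-id" | "dm-id" | "name";

// Hypothetical helper name.
function classifyChannelInput(input: string): ChannelInputKind {
  if (CHANNEL_ID_PATTERN.test(input)) return "channel-id"; // public (C) or private (G)
  if (DM_ID_PATTERN.test(input)) return "dm-id"; // needs im:*/mpim:* scopes: reject early
  return "name"; // resolve by channel name instead
}
```

A caller can turn the `dm-id` case into an explicit configuration error instead of waiting for Slack to answer with missing_scope.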
@waleedlatif1
Copy link
Copy Markdown
Collaborator Author

@greptile

@waleedlatif1
Copy link
Copy Markdown
Collaborator Author

@cursor review

Comment thread apps/sim/connectors/slack/slack.ts
The previous fix narrowed the regex in resolveChannel to [CG] but missed
the duplicate in validateConfig, which still admitted DM IDs and would
fall through to a name-based search returning 'Channel not found.'
@waleedlatif1
Collaborator Author

@greptile

@waleedlatif1
Collaborator Author

@cursor review

@cursor cursor Bot left a comment

✅ Bugbot reviewed your changes and found no new issues!

Reviewed by Cursor Bugbot for commit 75f0b1a. Configure here.

@waleedlatif1 waleedlatif1 merged commit 64642d4 into staging May 3, 2026
13 of 14 checks passed
@waleedlatif1 waleedlatif1 deleted the waleedlatif1/connector-audit-fixes branch May 3, 2026 00:54

1 participant